    A Fused Elastic Net Logistic Regression Model for Multi-Task Binary Classification

    Multi-task learning has been shown to significantly enhance the performance of multiple related learning tasks in a variety of situations. We present the fused logistic regression, a sparse multi-task learning approach for binary classification. Specifically, we introduce sparsity-inducing penalties over parameter differences of related logistic regression models to encode similarity across related tasks. The resulting joint learning task is cast into a form that lends itself to efficient optimization with a recursive variant of the alternating direction method of multipliers. We show results on synthetic data, describe the regime of settings where our multi-task approach achieves significant improvements over the single-task learning approach, and discuss the implications of applying the fused logistic regression in different real-world settings.
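
    The idea of penalizing parameter differences between related tasks can be illustrated with a minimal sketch. The abstract does not give the exact objective, so the form below is an assumption: a logistic loss per task plus an elastic-net style penalty (L1 and squared L2) on the difference between the weight vectors of each related task pair. The names `fused_objective`, `edges`, `lam1`, and `lam2` are hypothetical, and the sketch only evaluates the objective; it does not implement the ADMM solver mentioned in the abstract.

```python
import math


def log_loss(w, X, y):
    """Mean logistic loss for one task; labels y are in {0, 1}."""
    total = 0.0
    for x, label in zip(X, y):
        z = sum(wi * xi for wi, xi in zip(w, x))
        # softplus(z) - label * z, with softplus computed stably
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
    return total / len(X)


def fused_objective(weights, tasks, edges, lam1=0.1, lam2=0.1):
    """Sum of per-task logistic losses plus a fused elastic-net penalty.

    weights: one weight vector per task
    tasks:   one (X, y) pair per task
    edges:   (s, t) index pairs of tasks assumed to be related (hypothetical)
    """
    loss = sum(log_loss(w, X, y) for w, (X, y) in zip(weights, tasks))
    penalty = 0.0
    for s, t in edges:
        diffs = [a - b for a, b in zip(weights[s], weights[t])]
        penalty += lam1 * sum(abs(d) for d in diffs)  # sparsity in differences
        penalty += lam2 * sum(d * d for d in diffs)   # ridge on differences
    return loss + penalty
```

    With identical weight vectors on two related tasks the penalty vanishes, so the objective reduces to the sum of the single-task losses; any disagreement between related tasks is charged by `lam1` and `lam2`.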

    Taming the BEAST—A Community Teaching Material Resource for BEAST 2

    Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is no longer sufficient. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the “Taming the Beast” (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2.

    A Phylogeny-aware GWAS Framework to Correct for Heritable Pathogen Effects on Infectious Disease Traits

    Infectious diseases are particularly challenging for genome-wide association studies (GWAS) because genetic effects from two organisms (pathogen and host) can influence a trait. Traditional GWAS assume individual samples are independent observations. However, pathogen effects on a trait can be heritable from donor to recipient in transmission chains. Thus, residuals in GWAS association tests for host genetic effects may not be independent due to shared pathogen ancestry. We propose a new method to estimate and remove heritable pathogen effects on a trait based on the pathogen phylogeny prior to host GWAS, thus restoring independence of samples. In simulations, we show this additional step can increase GWAS power to detect truly associated host variants when pathogen effects are highly heritable, with strong phylogenetic correlations. We applied our framework to data from two different host-pathogen systems, HIV in humans and X. arboricola in A. thaliana. In both systems, the heritability and thus phylogenetic correlations turn out to be low enough that qualitative results of GWAS do not change when accounting for shared pathogen ancestry through a correction step. This means that previous GWAS results applied to these two systems should not be biased due to shared pathogen ancestry. In summary, our framework provides additional information on the evolutionary dynamics of traits in pathogen populations and may improve GWAS if pathogen effects are highly phylogenetically correlated amongst individuals in a cohort.

    Laurent inversion

    There are well-understood methods, going back to Givental and Hori–Vafa, that to a Fano toric complete intersection X associate a Laurent polynomial f that corresponds to X under mirror symmetry. We describe a technique for inverting this process, constructing the toric complete intersection X directly from its Laurent polynomial mirror f. We use this technique to construct a new four-dimensional Fano manifold.

    Dissecting HIV Virulence: Heritability of Setpoint Viral Load, CD4+ T-Cell Decline, and Per-Parasite Pathogenicity.

    Pathogen strains may differ in virulence because they attain different loads in their hosts, or because they induce different disease-causing mechanisms independent of their load. In evolutionary ecology, the latter is referred to as "per-parasite pathogenicity". Using viral load and CD4+ T-cell measures from 2,014 HIV-1 subtype B-infected individuals enrolled in the Swiss HIV Cohort Study, we investigated whether virulence (measured as the rate of decline of CD4+ T cells) and per-parasite pathogenicity are heritable from donor to recipient. We estimated heritability by donor-recipient regressions applied to 196 previously identified transmission pairs, and by phylogenetic mixed models applied to a phylogenetic tree inferred from HIV pol sequences. Regressing the CD4+ T-cell declines and per-parasite pathogenicities of the transmission pairs did not yield heritability estimates significantly different from zero. With the phylogenetic mixed model, however, our best estimate for the heritability of the CD4+ T-cell decline is 17% (5-30%), and that of the per-parasite pathogenicity is 17% (4-29%). Further, we confirm that the set-point viral load is heritable, and estimate a heritability of 29% (12-46%). Interestingly, the pattern of evolution of all these traits differs significantly from neutrality, and is most consistent with stabilizing selection for the set-point viral load, and with directional selection for the CD4+ T-cell decline and the per-parasite pathogenicity. Our analysis shows that the viral genotype affects virulence mainly by modulating the per-parasite pathogenicity, while the indirect effect via the set-point viral load is minor.
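
    As a rough illustration of the donor-recipient regression mentioned above, the sketch below computes the ordinary least-squares slope of recipient trait values on donor trait values, which is commonly read as a heritability estimate for a pathogen trait. The function name and the toy numbers in the usage note are hypothetical, and the abstract does not describe the study's actual estimation pipeline (for example covariate adjustment or confidence intervals), so this is only the bare regression step.

```python
def donor_recipient_slope(donor, recipient):
    """OLS slope of recipient trait values regressed on donor trait values.

    donor, recipient: equal-length sequences, one entry per transmission pair.
    """
    if len(donor) != len(recipient) or len(donor) < 2:
        raise ValueError("need at least two matched donor-recipient pairs")
    n = len(donor)
    mean_d = sum(donor) / n
    mean_r = sum(recipient) / n
    # Sample covariance and variance share the same 1/(n-1) factor,
    # so it cancels in the ratio.
    cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(donor, recipient))
    var = sum((d - mean_d) ** 2 for d in donor)
    if var == 0.0:
        raise ValueError("donor values have zero variance")
    return cov / var
```

    For example, `donor_recipient_slope([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 3.0])` uses made-up values in which each recipient value is exactly half the donor value plus one, so the slope recovers that factor of one half.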

    A Practical Guide to Estimating the Heritability of Pathogen Traits

    Pathogen traits, such as the virulence of an infection, can vary significantly between patients. A major challenge is to measure the extent to which genetic differences between infecting strains explain the observed variation of the trait. This is quantified by the trait’s broad-sense heritability, H². A recent discrepancy between estimates of the heritability of HIV virulence has opened a debate on the estimators’ accuracy. Here, we show that the discrepancy originates from model limitations and important lifecycle differences between sexually reproducing organisms and transmittable pathogens. In particular, current quantitative genetics methods, such as donor–recipient regression of surveyed serodiscordant couples and the phylogenetic mixed model (PMM), are prone to underestimate H², because they neglect or do not fit the loss of resemblance between transmission partners caused by within-host evolution. In a phylogenetic analysis of 8,483 HIV patients from the United Kingdom, we show that the phenotypic correlation between transmission partners decays with the amount of within-host evolution of the virus. We reproduce this pattern in toy-model simulations and show that a phylogenetic Ornstein–Uhlenbeck model (POUMM) outperforms the PMM in capturing this correlation pattern and in quantifying H². In particular, we show that POUMM outperforms PMM even in simulations without selection, because it captures the mentioned correlation pattern, which has not been appreciated until now. By cross-validating the POUMM estimates with ANOVA on closest phylogenetic pairs, we obtain H² ≈ 0.2, meaning about 20% of the variation in HIV virulence is explained by the virus genome both for European and African data.

    Automatic generation of evolutionary hypotheses using mixed Gaussian phylogenetic models

    Phylogenetic comparative methods are widely used to understand and quantify the evolution of phenotypic traits, based on phylogenetic trees and trait measurements of extant species. Such analyses depend crucially on the underlying model. Gaussian phylogenetic models like Brownian motion and Ornstein-Uhlenbeck processes are the workhorses of modeling continuous-trait evolution. However, these models fit poorly to big trees, because they neglect the heterogeneity of the evolutionary process in different lineages of the tree. Previous works have addressed this issue by introducing shifts in the evolutionary model occurring at inferred points in the tree. However, for computational reasons, in all current implementations, these shifts are "intramodel," meaning that they allow jumps in 1 or 2 model parameters, keeping all other parameters "global" for the entire tree. There is no biological reason to restrict a shift to a single model parameter or, even, to a single type of model. Mixed Gaussian phylogenetic models (MGPMs) incorporate the idea of jointly inferring different types of Gaussian models associated with different parts of the tree. Here, we propose an approximate maximum-likelihood method for fitting MGPMs to comparative data comprising possibly incomplete measurements for several traits from extant and extinct phylogenetically linked species. We applied the method to the largest published tree of mammal species with body- and brain-mass measurements, showing strong statistical support for an MGPM with 12 distinct evolutionary regimes. Based on this result, we state a hypothesis for the evolution of the brain-body-mass allometry over the past 160 million years.
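
    To make the two Gaussian building blocks named above concrete, the sketch below gives the textbook conditional mean and variance of a univariate trait after evolving for time `t` along a single branch, under an Ornstein-Uhlenbeck process with Brownian motion as the limit of zero selection strength. The function and parameter names (`x0`, `theta`, `alpha`, `sigma`) are assumptions chosen for illustration; this is not the multivariate MGPM fitting method of the paper.

```python
import math


def branch_moments(x0, theta, alpha, sigma, t):
    """Conditional mean and variance of a trait at the end of a branch.

    x0:    trait value at the start of the branch
    theta: long-term optimum (ignored when alpha == 0)
    alpha: selection strength; alpha == 0 gives Brownian motion
    sigma: diffusion rate (standard deviation per unit square-root time)
    t:     branch length
    """
    if alpha < 0 or t < 0:
        raise ValueError("alpha and t must be non-negative")
    if alpha == 0:
        # Brownian motion: the mean stays put, variance grows linearly in t.
        return x0, sigma ** 2 * t
    decay = math.exp(-alpha * t)
    mean = theta + (x0 - theta) * decay
    # sigma^2 / (2 alpha) * (1 - exp(-2 alpha t)), using expm1 for accuracy
    var = sigma ** 2 * (-math.expm1(-2.0 * alpha * t)) / (2.0 * alpha)
    return mean, var
```

    Under Brownian motion the variance is unbounded in `t`, whereas under Ornstein-Uhlenbeck it saturates at `sigma**2 / (2 * alpha)`, which is one reason a single model type may describe different parts of a large tree unequally well.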